Data topology visualization for the Self-Organizing Map
Abstract
The Self-Organizing Map (SOM), a powerful method for data mining and cluster extraction, is very useful for processing data of high dimensionality and complexity. Visualization methods present different aspects of the information learned by the SOM to gain insight and guide segmentation of the data. In this work, we propose a new visualization scheme that represents data topology superimposed on the SOM grid, and we show how it helps in the discovery of data structure.
1 Visualization of SOM knowledge
The Self-Organizing Map (SOM) [1] is a widely and successfully used neural paradigm for clustering and data mining. Informative representation of the learned SOM's knowledge greatly aids precise capture of the cluster boundaries. This is especially important for high-dimensional and large data sets with many meaningful clusters, such as in remote sensing or medical imagery, which often also have interesting rare clusters to be discovered.
An impressive suite of previous works includes the U-matrix [2] and its variants, which are useful when a relatively large SOM grid accommodates small data sets with a low number of clusters (e.g., [3], [4], [5]) but, because of averaging of prototype distances over neighbours or thresholding, they tend to miss finer structure in complicated data [6]. Unique approaches such as [7] and gravitational methods (e.g., Adaptive Coordinates [4]) visualize distances between receptive field centres in innovative ways that greatly help manual cluster extraction. Experiments with automated colour assignments aim at qualitative exploration of the approximate cluster structure [8], [9], [10]. We point the reader to [4], [11] for further review. Some earlier methods use the size of the receptive fields of the prototypes for visualization (e.g., [5], [9]), but none exploit the data topology. Visualization of the mapping of samples, adjacent in data space, to different SOM prototypes is useful when prototypes outnumber data samples [12].
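The averaging of prototype distances over grid neighbours that underlies U-matrix-style displays can be illustrated with a minimal sketch. This is a simplified variant, not the exact formulation of [2]: it assumes a rectangular grid with 4-connected neighbours and Euclidean distance, and the function name `u_matrix` and the array layout are hypothetical choices made for illustration.

```python
import numpy as np

def u_matrix(weights):
    """Mean distance from each prototype to its 4-connected grid neighbours.

    weights: array of shape (rows, cols, dim) holding the SOM prototypes
             (assumed layout; a rectangular grid is assumed).
    Returns an array of shape (rows, cols).
    """
    rows, cols, _ = weights.shape
    total = np.zeros((rows, cols))
    count = np.zeros((rows, cols))

    # Distances between vertically adjacent prototypes, shape (rows-1, cols).
    d_v = np.linalg.norm(weights[1:] - weights[:-1], axis=-1)
    total[1:] += d_v
    count[1:] += 1
    total[:-1] += d_v
    count[:-1] += 1

    # Distances between horizontally adjacent prototypes, shape (rows, cols-1).
    d_h = np.linalg.norm(weights[:, 1:] - weights[:, :-1], axis=-1)
    total[:, 1:] += d_h
    count[:, 1:] += 1
    total[:, :-1] += d_h
    count[:, :-1] += 1

    # Guard against a 1x1 grid, where no unit has a neighbour.
    return total / np.maximum(count, 1)
```

Because each grid cell holds an average over several neighbour distances, a single large distance across a narrow cluster boundary is diluted by the smaller ones around it, which is consistent with the loss of fine structure noted above.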
When data samples are plentiful, the only adjacent samples mapped to different prototypes are those at the boundaries of the Voronoi polyhedra, so this visualization ignores much helpful mapping information. We visualize the data topology on the SOM grid, showing topology violations and effectively aiding detailed cluster capture, including fine structure, in large real data with many clusters of widely varying statistics.
∗ This work was partially supported by grant NNG05GA94G from the Applied Information Systems Research Program, NASA, Science Mission Directorate. Figures are in colour; a colour copy can be requested from the authors by email.
ESANN'2006 proceedings, European Symposium on Artificial Neural Networks, Bruges (Belgium), 26-28 April 2006, d-side publi., ISBN 2-930307-06-4.
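One common way to estimate data topology from prototypes, given here only as an illustration, is to count for every pair of prototypes how many samples have that pair as their two closest prototypes. Whether this matches the scheme proposed in the paper is an assumption; the abstract does not specify the construction. The function name `connectivity_matrix` and the array shapes are hypothetical.

```python
import numpy as np

def connectivity_matrix(data, prototypes):
    """Count samples whose two closest prototypes are i and j.

    data:       array of shape (n, dim)
    prototypes: array of shape (k, dim), with k >= 2
    Returns a symmetric (k, k) integer matrix with a zero diagonal.
    """
    # Squared Euclidean distance from every sample to every prototype, (n, k).
    diff = data[:, None, :] - prototypes[None, :, :]
    d2 = (diff ** 2).sum(axis=-1)

    # Indices of the best and second-best matching prototype per sample.
    nearest_two = np.argsort(d2, axis=1)[:, :2]
    best, second = nearest_two[:, 0], nearest_two[:, 1]

    k = prototypes.shape[0]
    conn = np.zeros((k, k), dtype=int)
    # np.add.at accumulates correctly when the same (i, j) pair repeats.
    np.add.at(conn, (best, second), 1)

    # Symmetrize: the pair counts regardless of which member is the best match.
    return conn + conn.T
```

Under these assumptions, a display could draw a line between the grid positions of prototypes `i` and `j` whenever `conn[i, j] > 0`, with width proportional to the count. Lines joining units that are not grid neighbours would then flag possible topology violations, and weakly connected regions would suggest cluster boundaries.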
Similar resources
A combined measure for quantifying and qualifying the topology preservation of growing self-organizing maps
Keywords: Topology preserving; Self-organizing map; Growing cell structures; Visualization methods; Delaunay triangulation
The Self-Organizing Map (SOM) is a neural network model that performs an ordered projection of a high dimensional input space in a low-dimensional topological structure. The process in which such mapping is formed is defined by the SOM algorithm, which is a competitive, unsuper...
Mnemonic SOMs: Recognizable Shapes for Self-Organizing Maps
The Self-Organizing Map (SOM) enjoys significant popularity in the field of data mining and visualization. While its topology-preserving mapping allows easier interpretation of complex data, communicating the location of clusters and individual data items as well as memorizing locations are not solved satisfactorily in conventional rectangular maps. In this paper, a variant of self-organizing m...
Document Clustering and Visualization with Latent Dirichlet Allocation and Self-Organizing Maps
Clustering and visualization of large text document collections aids in browsing, navigation, and information retrieval. We present a document clustering and visualization method based on Latent Dirichlet Allocation and self-organizing maps (LDA-SOM). LDA-SOM clusters documents based on topical content and renders clusters in an intuitive two-dimensional format. Document topics are inferred usin...
Information Visualization with Self-Organizing Maps
The Self-Organizing Map (SOM) is an unsupervised neural network algorithm that projects high-dimensional data onto a two-dimensional map. The projection preserves the topology of the data so that similar data items will be mapped to nearby locations on the map. Despite the popular use of the algorithm for clustering and information visualisation, a system has been lacking that combines the fast ...
Clustering Quality and Topology Preservation in Fast Learning SOMs
The Self-Organizing Map (SOM) is a popular unsupervised neural network able to provide effective clustering and data visualization for data represented in multidimensional input spaces. In this paper we describe Fast Learning SOM (FLSOM) which adopts a learning algorithm that improves the performance of the standard SOM with respect to the convergence time in the training phase. We show that ...
Advanced Visualization Techniques for Self-organizing Maps with Graph-Based Methods
The Self-Organizing Map is a popular neural network model for data analysis, for which a wide variety of visualization techniques exists. We present a novel technique that takes the density of the data into account. Our method defines graphs resulting from nearest-neighbor- and radius-based distance calculations in data space and shows projections of these graph structures on the map. It can then...